AITopics | conceptual question

conceptual question

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

FinanceQA: A Benchmark for Evaluating Financial Analysis Capabilities of Large Language Models

Mateega, Spencer, Georgescu, Carlos, Tang, Danny

arXiv.org Artificial Intelligence, Jan-29-2025

FinanceQA is a testing suite that evaluates LLMs' performance on complex numerical financial analysis tasks that mirror real-world investment work. Despite recent advances, current LLMs fail to meet the strict accuracy requirements of financial institutions, with models failing approximately 60% of realistic tasks that mimic on-the-job analyses at hedge funds, private equity firms, investment banks, and other financial institutions. The primary challenges include hand-spreading metrics, adhering to standard accounting and corporate valuation conventions, and performing analysis under incomplete information, particularly in multi-step tasks requiring assumption generation. This performance gap highlights the disconnect between existing LLM capabilities and the demands of professional financial analysis, which current testing architectures test inadequately. Results show that higher-quality training data is needed to support such tasks, which we experiment with using OpenAI's fine-tuning API.

calculation, financeqa, information, (15 more...)

2501.18062

Country: North America > United States (0.28)

Genre:

Financial News (0.95)
Research Report > New Finding (0.48)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Decomposed Prompting to Answer Questions on a Course Discussion Board

Jaipersaud, Brandon, Zhang, Paul, Ba, Jimmy, Petersen, Andrew, Zhang, Lisa, Zhang, Michael R.

arXiv.org Artificial Intelligence, Jul-30-2024

We propose and evaluate a question-answering system that uses decomposed prompting to classify and answer student questions on a course discussion board. Our system uses a large language model (LLM) to classify questions into one of four types: conceptual, homework, logistics, and not answerable. This enables us to employ a different strategy for answering questions that fall under different types. Using a variant of GPT-3, we achieve 81% classification accuracy. We discuss our system's performance on answering conceptual questions from a machine learning course and various failure modes.
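The classify-then-route idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the prompt wording, the function names `classify_question` and `answer_question`, the fallback behavior, and the `llm` callable (any function that maps a prompt string to a completion string) are all assumptions made for illustration. Only the four question types come from the abstract.

```python
from typing import Callable, Dict, Tuple

# The four question types named in the abstract.
QUESTION_TYPES = ("conceptual", "homework", "logistics", "not answerable")


def classify_question(question: str, llm: Callable[[str], str]) -> str:
    """Ask the model for a type label (hypothetical prompt wording)."""
    prompt = (
        "Classify the student question as one of: "
        + ", ".join(QUESTION_TYPES)
        + ".\nQuestion: " + question + "\nType:"
    )
    label = llm(prompt).strip().lower()
    # Assumption: any unexpected model output is treated as not answerable.
    return label if label in QUESTION_TYPES else "not answerable"


def answer_question(
    question: str,
    llm: Callable[[str], str],
    strategies: Dict[str, Callable[[str], str]],
) -> Tuple[str, str]:
    """Classify first, then dispatch to the strategy registered for that type."""
    label = classify_question(question, llm)
    handler = strategies.get(label)
    if handler is None:
        # Assumption: types without a registered strategy go to a human.
        return label, "Deferred to course staff."
    return label, handler(question)
```

In use, `strategies` would map each type to its own answering routine, for example a second, more specific prompt for conceptual questions, while unhandled types fall through to course staff.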

accuracy, conceptual question, student question, (14 more...)

doi: 10.1007/978-3-031-36336-8_33

2407.2117

Country:

North America > Canada > Ontario > Toronto (0.16)
North America > Canada > Manitoba > Westman Region > Brandon (0.04)
Europe > United Kingdom > England > Durham > Durham (0.04)

Genre:

Instructional Material (0.70)
Research Report (0.50)

Industry: Education > Educational Setting (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

Using Large Language Models for Cybersecurity Capture-The-Flag Challenges and Certification Questions

Tann, Wesley, Liu, Yuancheng, Sim, Jun Heng, Seah, Choon Meng, Chang, Ee-Chien

arXiv.org Artificial Intelligence, Aug-20-2023

The assessment of cybersecurity Capture-The-Flag (CTF) exercises involves participants finding text strings or "flags" by exploiting system vulnerabilities. Large Language Models (LLMs) are natural-language models trained on vast amounts of words to understand and generate text; they can perform well on many CTF challenges. Such LLMs are freely available to students. In the context of CTF exercises in the classroom, this raises concerns about academic integrity. Educators must understand LLMs' capabilities to modify their teaching to accommodate generative AI assistance. This research investigates the effectiveness of LLMs, particularly in the realm of CTF challenges and questions. Here we evaluate three popular LLMs: OpenAI ChatGPT, Google Bard, and Microsoft Bing. First, we assess the LLMs' question-answering performance on five Cisco certifications with varying difficulty levels. Next, we qualitatively study the LLMs' abilities in solving CTF challenges to understand their limitations. We report on the experience of using the LLMs for seven test cases in all five types of CTF challenges. In addition, we demonstrate how jailbreak prompts can bypass and break LLMs' ethical safeguards. The paper concludes by discussing LLMs' impact on CTF exercises and its implications.

ctf challenge, llm, participant, (14 more...)

2308.10443

Country: Asia > Singapore > Central Region > Singapore (0.05)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Education (1.00)
Government > Military > Cyberwarfare (0.73)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.57)
